Researching Data Science Education: Perspectives on Qualitative Research Methods

Dr. Allison Theobold

Today’s layout


  1. Some background on qualitative research
  2. Qualitative investigations into students’ code
  3. Qualitative investigations into group work
  4. Implications for oral assessments

A bit about me…

Ph.D. in Statistics from Montana State University

“Supporting Data-Intensive Environmental Science Research: Data Science Skills for Scientific Practitioners of Statistics”

Land acknowledgement

The Research Triangle sits on the territory of several Native nations, including the Tutelo- and Saponi-speaking peoples. We acknowledge, respect, and thank the tribes on whose stolen land we are guests.

Indigenous people are not relics of the past. We who work and live here must acknowledge past violence and ongoing harm produced by the continued effects of colonization.

Qualitative research

“Qualitative researchers strive to understand the meaning people have constructed about their world and their experiences.” (Merriam 2002)


“Qualitative research is an effort to understand situations in their uniqueness as part of a particular context and the interactions there. This understanding is an end in itself.” (Patton 1990)

What are the principles of qualitative research?


  • The researcher is the primary instrument for data collection and data analysis

  • The analysis seeks to find emerging themes

  • The product of a qualitative study is richly descriptive

How might this look?


Sample Selection

Select a sample from which the most can be learned!

Data Collection

Major sources of data – interviews, observations, documents

Data Analysis

Compare units of data to find common patterns across the data

Investigating student learning through code

Warm-up (90 seconds)


RPMA2GrowthSub$Weight[RPMA2GrowthSub$Age == 1]


How would you describe the action(s) being taken in this statement?

A framework for analyzing students’ code (Schulte 2008)

|  | Text Surface | Program Execution | Function |
|---|---|---|---|
| Macrostructure | Understanding the overall structure of the program | Understanding the “algorithm” of the program | Understanding the goal / purpose of the program (in its context) |
| Relations | References between blocks, e.g., method calls, object creation | Sequence of method calls, object sequence diagrams | Understanding how sub-goals are related to goals, how function is achieved by subfunctions |
| Blocks | Regions of interest (ROI) that syntactically or semantically build a unit | Operation of a block, a method, or a ROI (as a sequence of statements) | Function of a block, may be seen as a sub-goal |
| Atoms | Language elements | Operation of a statement | Function of a statement, only understandable in context |

Coding students’ code


RPMA2GrowthSub$Weight[RPMA2GrowthSub$Age == 1]


Descriptive code

“Filters a vector of values using extraction operator, based on an equality relation with a variable selected from dataframe using $ operator”

In-vivo code

“Uses [ ] and == to filter vector, uses $ to select variable”
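To make the descriptive code concrete, the statement can be unpacked one atom at a time. The sketch below assumes a small, made-up data frame standing in for RPMA2GrowthSub (the real dataset is not shown in these slides); only the column names Weight and Age come from the statement itself, and the intermediate object names are hypothetical.

```r
# Hypothetical stand-in for the student's data; values are invented for illustration.
RPMA2GrowthSub <- data.frame(
  Age    = c(1, 2, 1, 3),
  Weight = c(10.5, 20.1, 11.2, 30.4)
)

# Atom 1: `$` selects one variable from the data frame as a vector.
ages <- RPMA2GrowthSub$Age

# Atom 2: `==` compares each element to 1, giving a logical vector.
isAgeOne <- ages == 1

# Atom 3: `[ ]` keeps the Weight values where the logical vector is TRUE.
weightsAgeOne <- RPMA2GrowthSub$Weight[isAgeOne]

# The one-line statement from the slide returns the same values.
oneLine <- RPMA2GrowthSub$Weight[RPMA2GrowthSub$Age == 1]
identical(weightsAgeOne, oneLine)
```

Reading the statement this way lines up with the "atoms" row of the framework: each operator is a language element whose function only becomes clear in the context of the whole statement.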

Uncovering emergent themes

linearAnterior <- lm(PADataNoOutlier$Lipid ~ PADataNoOutlier$PSUA)

early <- subset(RPMA2Growth, StockYear < 2006)  

Weight5 <- mean(RPMA2GrowthSub$Weight[RPMA2GrowthSub$Age == 5], na.rm = TRUE)

gas <- gas[!(substr(gas$sampleID,3,3) %in% c("b","c")), ]   

obsD <- subset(gas, gas$carboy == "D")$N15_N2_Ar

lowerCIBound <- pMat[1:mlleIndex,1][which.min(abs(mlleCI+likelihoods[1:mlleIndex]))]

Data wrangling

Statements of code whose purpose is to prepare a dataset for analysis and / or visualization

Sub-themes

  • selecting variables
  • filtering observations
  • mutating variables
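Each sub-theme can be illustrated in base R, the dialect used in the student statements above. This is a minimal sketch on an invented data frame: the object name `growth`, its values, and the assumption that Weight is recorded in grams are all hypothetical and not taken from the study.

```r
# Hypothetical data frame; column names echo the student code, values are invented.
growth <- data.frame(
  StockYear = c(2004, 2005, 2007, 2008),
  Age       = c(1, 5, 5, 1),
  Weight    = c(12, 310, 295, 15)
)

# Selecting variables: keep only the columns needed for the analysis.
selected <- growth[, c("StockYear", "Weight")]

# Filtering observations: keep only the rows that satisfy a condition.
early <- subset(growth, StockYear < 2006)

# Mutating variables: add a new column computed from an existing one
# (assumes, for illustration only, that Weight is in grams).
growth$WeightKg <- growth$Weight / 1000
```

A single student statement may carry more than one of these codes, for example a `subset()` call followed by `$` both filters observations and selects a variable.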

An alternative direction


Process coding:

uses gerunds (“-ing” words) to connote action in the data (Saldaña 2013)


  • Particularly relevant to describing the processes of human actions
  • Can be intertwined with time, such that actions can emerge, change, or occur in particular sequences.

Practical considerations

How much code should I collect?

  • Driven by the research question!
    • Amount of each student’s code
    • Number of students

How do readers trust my analysis?

  • Trust comes from:

    • confirmability
    • reliability
    • credibility
    • transferability


Excellent resources: Creswell & Poth (2018); Merriam & Tisdell (2016); Miles et al. (2020)

How could this be used?

Concept dependence

How does a student’s conceptual model of a dataset inform how they filter data?

(atoms; program execution)

Program environment

How do the visualizations produced by students who learn ggplot differ from those who learn “base” R?

(blocks; program execution)

Linguistic structure

How do students name objects they will use later?

(relations; text surface)

Learning trajectory

How do students’ exploratory data analyses change over the duration of a course?

(macrostructure; function / purpose)

Why is this important for data science education?

Theobold et al. (2023)


How can we distinguish merely interesting learning from effective learning (Wiggins and McTighe 2005)?

Investigating group collaborations

Another warm-up 😊

(2 minutes)

Read through the dialogue on the back page of the handout. Consider how students are using language to:

  • Build community and / or collaboration
  • Position (mathematical) thinking as significant

Discourse Analysis

is the study of language (structure, form, and syntax). Paired with the study of language-in-use, discourse analysis allows us to study how language is used to create and enact identities (Gee 2014).


Language is used to build:

  • Significance
  • Practices
  • Culturally supported activities
  • Identities
  • Relationships
  • Politics
  • Connections
  • Sign systems and knowledge

Revisiting Uma & Sean


This time consider if and how Uma’s language differs from Sean’s to accomplish the two building tasks:

  • Build collaboration for mathematics learning or problem solving. How are collaborations being fostered or discouraged?

  • Establish mathematical significance. Whose thinking is being positioned as significant? By whom? How?


Be prepared to share!

Results & Claims

(Theobold and Williams 2023)

“Fights over who gets to speak and whose words are recognized are indicative of power and status” (Johnson 2002)


The Influence Framework (Engle et al. 2014)

  • treated as a credible source of information

  • ideas are positioned as high / low quality

  • access to the conversational floor

  • degree of spatial privilege

A Discourse Scorecard

Proposal negotiation units (PNUs)

begin when one group member makes a proposal bid to pursue a mathematical approach or poses a directive to other group members


|  | Intellectual Merit | Intellectual Authority | Directive Authority | Influence |
|---|---|---|---|---|
| Uma | 6 (13) | 19 (38) | 0 (2) | 0 |
| Sean | 38 (12) | 47 (10) | 4 (0) | 4 |

What can qualitative research teach us about oral exams?

Oral exams

(Theobold 2021)


  1. You are a qualitative researcher!

  2. You are both the tool for data collection and data analysis.

  3. Your analysis must be trustworthy.

Collecting data

  • How will you decide what questions to ask?
  • How will you address the nature of how the data are collected?
  • How will you address your own positionality?

Analyzing data

  • How do you decide how “well” someone did?
  • How “stable” are your results?

Questions?

References

Corbin, Juliet, and Anselm Strauss. 2008. Basics of Qualitative Research: Techniques and Procedures for Developing Grounded Theory. Thousand Oaks: Sage.
Creswell, J. W., and C. N. Poth. 2018. Qualitative Inquiry & Research Design. Thousand Oaks, CA: Sage.
Engle, Randi A., Jennifer M. Langer-Osuna, and Maxine McKinney de Royston. 2014. “Toward a Model of Influence in Persuasive Discussions: Negotiating Quality, Authority, Privilege, and Access Within a Student-Led Argument.” Journal of the Learning Sciences 23 (2): 245–68. https://doi.org/10.1080/10508406.2014.883979.
Gee, James Paul. 2014. How to Do Discourse Analysis. 2nd ed. Abingdon, Oxon: Routledge.
Merriam, S. B., and E. J. Tisdell. 2016. Qualitative Research. San Francisco, CA: John Wiley & Sons.
Merriam, Sharan B. 2002. Qualitative Research in Practice: Examples for Discussion and Analysis. 1st ed. New York: John Wiley & Sons.
Miles, M. B., A. M. Huberman, and J. Saldaña. 2020. Qualitative Data Analysis. Thousand Oaks, CA: Sage.
Patton, Michael Q. 1990. Qualitative Evaluation Methods. 2nd ed. Thousand Oaks: Sage.
Saldaña, J. 2013. The Coding Manual for Qualitative Researchers. Thousand Oaks: Sage.
Schulte, Carsten. 2008. “Block Model.” Proceedings of the Fourth International Workshop on Computing Education Research, September. https://doi.org/10.1145/1404520.1404535.
Theobold, Allison S. 2021. “Oral Exams: A More Meaningful Assessment of Students’ Understanding.” Journal of Statistics and Data Science Education 29 (2): 156–59. https://doi.org/10.1080/26939169.2021.1914527.
Theobold, Allison S., Megan M. Wickstrom, and Stacey A. Hancock. 2023. “Coding Code: Qualitative Methods for Investigating Data Science Skills.”
Theobold, Allison S., and Derek A. Williams. 2023. “‘She’s Probably Contributing More Than I Gave Her Credit’: A Feminist Perspective on Group Discourse.”
Wiggins, G., and J. McTighe. 2005. Understanding by Design. 2nd ed. Alexandria: Association for Supervision and Curriculum Development (ASCD).